ABSTRACT


Two truncated forms of Euglena gracilis chloroplast
initiation factor 2 (IF-2chi) have been engineered at the DNA level
and expressed at a high level in Escherichia coli. The DNA
was obtained by PCR amplification of a cDNA encoding a
portion of IF-2chi. The PCR primers were custom-designed and
added a restriction site at each end for sub-cloning into
the QIAexpress expression system. This expression system
provides excellent protein yield in E. coli and adds six
histidine residues, which bind Ni-NTA affinity resin.

Two polypeptides were produced. A 30 kDa polypeptide
containing the highly homologous guanine nucleotide
binding region of IF-2chi was designed and has been
designated the G-domain. The second protein, designated
IF-2chi gamma, was designed to be approximately the same
size (66 kDa) as E. coli IF-2 gamma. Based on sodium
dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) analysis,
both proteins were expressed in E. coli, and production
of the G-domain was superb. IF-2chi gamma was present at a
much lower relative concentration, and its presence greatly
retarded cell growth.

The G-domain protein was also purified under native
conditions to approximately 85-90 % purity. A four-step
process utilizing affinity chromatography and high-pressure
liquid chromatography was followed, and the presence of the
protein was monitored by SDS-PAGE.




INTRODUCTION 


The two energy-producing organelles of eukaryotic
cells, chloroplasts and mitochondria, each possess their own
genome. They also contain the necessary machinery to
transcribe and translate this genetic information into RNAs
and proteins (1). Previously in this laboratory, initiation
factor 2 (IF-2chi), one of the protein factors for initiation
of translation in the chloroplast of E. gracilis, was
purified and characterized (2). IF-2chi is encoded by a
nuclear gene, and a portion of its cDNA has been cloned (3).
This report describes the sub-cloning and expression of two
truncated forms of IF-2chi.

Relatively little is understood about IF-2chi, but IF-2
from the prokaryote Escherichia coli has been extensively
studied and characterized. E. coli IF-2 was used as a
model for the study of IF-2chi for several reasons. First,
E. coli IF-2 and E. gracilis IF-2chi share significant
homology in the C-terminal region (Appendix A). Second,
chloroplasts are believed to have arisen in evolution
through an endosymbiotic relationship between an early
cyanobacterium and an early eukaryotic cell (1). Thus,
though E. gracilis is a eukaryote, one would expect the
translational system in the chloroplast to more closely
resemble those of prokaryotic organisms. Third, the
eukaryotic homolog of IF-2, eIF-2, is a heterotrimeric
protein that resembles IF-2chi much less closely than does
prokaryotic IF-2. Therefore, IF-2 from E. coli makes an
excellent model for this investigation.

E. coli IF-2 can be found in two forms in vivo, 97.3
kDa and 79.7 kDa. These two forms arise as a result of the
presence of two translational start sites on the mRNA (4).
The two forms are designated alpha and beta, respectively
(4). Another form, gamma (65.4 kDa), can be isolated and
appears to be the result of proteolytic cleavage during
purification (4). All three forms differ only in their
N-terminal ends, and all are equally active, though gamma
exhibits slightly decreased stability (5). Furthermore,
Laalami et al. (6) report that a genetically engineered 55
kDa C-terminal fragment of E. coli IF-2 is active in vivo.

IF-2 has a major role in the initiation of translation
in prokaryotes. (See Figure 1 for a schematic overview of
initiation in prokaryotes.) This factor is essential for
the formation of the translational initiation complex
because it facilitates the binding of the initiator tRNA
(fMet-tRNAfMet) to the 30S ribosomal subunit in the presence
of mRNA (4). GTP is also required for the formation of the
initiation complex and is bound by IF-2. IF-2 acts as a
ribosome-dependent GTPase upon the binding of the 50S
ribosomal subunit to the initiation complex (7). These
functions have been qualitatively assigned to certain
regions in IF-2.

Figure 1: Schematic overview of initiation of translation in
E. coli (8).

General Scheme for the Initiation of
Protein Biosynthesis in E. coli

Based on homology to other G-proteins and GDP binding
studies on engineered forms of IF-2, the GTP binding
function has been assigned to amino acids 392-540 of the 890
amino acid alpha form of IF-2 (4). Also, based on the
ability of the 55 kDa protein to sustain growth in E. coli
when supplied in excess, the fMet-tRNAfMet binding function
has been assigned downstream of residue 540. Thus, the
first 42.8 kDa of the alpha form is not required for cell
viability when the C-terminal portion of the factor is
available in elevated amounts (4).

With this understanding of E. coli IF-2 as a frame of
reference, it is appropriate to examine what is known about
IF-2chi from E. gracilis. IF-2chi has been isolated in two
main forms, designated alpha and beta, based on their
affinity for DEAE-cellulose (2). The alpha form can be
divided further into alpha-I and alpha-II. The alpha-I
forms of IF-2chi have molecular masses ranging from 700-800
kDa and are composed of tetramers of 200 kDa subunits. The
alpha-II form of IF-2chi has a molecular mass in the range of
200-400 kDa and is a multimer of 97 and 110 kDa subunits.
The beta form has a molecular mass of 200 kDa and is
composed of 97 kDa subunits (2). Though IF-2chi is vastly
different in size and in its state of aggregation from E.
coli IF-2, its biological function is quite similar.

All three forms of IF-2chi, like E. coli IF-2, promote
fMet-tRNAfMet binding to chloroplast 30S ribosomal subunits
in a message-dependent manner (2). As in prokaryotic
initiation, this step requires GTP (2). IF-2chi is not
active on E. coli ribosomes, but IF-2 from E. coli is active
with chloroplast ribosomes (2). Finally, it is not known whether
the 97, 110, or 200 kDa subunits of the various forms of
IF-2chi are active as monomers or how they originate. Based on
the presence of intermediate sizes of the protein in sodium
dodecyl sulfate polyacrylamide gel electrophoresis
(SDS-PAGE) and Western blot experiments, as well as mRNA
analysis (3), it is believed that this protein is probably
made as a 200 kDa monomer and proteolytically processed in
the cell (2).

Part of the cDNA derived from the nuclear-encoded gene
for IF-2chi has been cloned and sequenced in this laboratory,
yielding several interesting findings. First, a Northern
blot showed an mRNA approximately 6.5 kb in length (3), which
could correspond to a protein over 200 kDa in size. Second, using a
random-primed cDNA library, 2850 base pairs of the cDNA have
been cloned and sequenced from the probable center of the
gene to the poly(A) tail (9), including a 2466 nucleotide
open reading frame (Appendix B). The translated protein
sequence of the region of IF-2chi characterized to date is
homologous to E. coli IF-2, especially in the C-terminal
half of the protein. The overall identity is 38 %
(Appendix A). IF-2chi is 75 percent identical to E. coli
IF-2 in the G-domain region (residues 392-540; Appendix A).
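
The size argument above can be checked with a rough calculation. The sketch below assumes an average residue mass of 110 Da, a common rule of thumb that is not taken from this report, and treats every nucleotide as coding, so the figure for the 6.5 kb mRNA is only an upper bound (untranslated regions and the poly(A) tail are ignored).

```python
# Rough coding-capacity estimate.
# ASSUMPTION: 110 Da per amino acid residue (rule-of-thumb average).
AVG_RESIDUE_MASS_DA = 110.0


def max_protein_kda(nucleotides: int) -> float:
    """Upper bound on protein mass (kDa) if every nucleotide were coding."""
    codons = nucleotides // 3
    return codons * AVG_RESIDUE_MASS_DA / 1000.0


# Lengths taken from the text: ~6.5 kb mRNA and the 2466 nucleotide ORF.
print(f"6.5 kb mRNA: at most ~{max_protein_kda(6500):.0f} kDa")
print(f"2466 nt ORF: ~{max_protein_kda(2466):.0f} kDa")
```

Under these assumptions the full-length message could encode a polypeptide somewhat larger than 200 kDa, while the cloned open reading frame accounts for well under half of that.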

Two polypeptides, corresponding to different regions of
IF-2chi, have been engineered at the DNA level and expressed
in E. coli. One polypeptide represents the G-domain and
some surrounding residues of IF-2chi. The second is a
protein designed to be the same size as the E. coli gamma
form of IF-2. This protein will be referred to as IF-2chi
gamma. The partial purification of the G-domain protein has
also been accomplished.

MATERIALS AND METHODS


Materials: Taq DNA polymerase, T4 ligase, restriction
enzymes, and appropriate reaction buffers were obtained from
Promega. The SeqLEnd oligonucleotide was synthesized by the
Pathology Department at the University of North Carolina at
Chapel Hill (UNC-CH), and the oligonucleotides 5-gamma and
3-gdomain were synthesized in the Lineberger Comprehensive
Cancer Center at UNC-CH (Figure 2). Kanamycin,
ampicillin, isopropyl β-D-thiogalactopyranoside (IPTG),
alumina A305, ethidium bromide, and
2-(N-morpholino)ethanesulfonic acid (MES) were purchased from Sigma. Nickel
nitrilotriacetic acid (Ni-NTA) resin, E. coli M15, and pQE
vectors were obtained from QIAGEN as part of a QIAexpress
kit. A GENECLEAN kit for DNA isolation was purchased from
BIO 101, and SDS-PAGE pre-stained molecular weight markers
as well as reagents for protein determination by the
Bradford method were obtained from Bio-Rad. Tris,
β-mercaptoethanol (βMe), ethylenediaminetetraacetic acid
(EDTA), and agarose were purchased from Fisher, and HEPES
from Mallinckrodt. High-pressure liquid chromatography
(HPLC) columns TSKgel DEAE-5PW (7.5 mm x 75 mm) and SP-5PW
(7.5 mm x 75 mm) were purchased from Beckman. IF-2chi clones
in a pBluescript plasmid were kindly provided by Dr. Lan Ma,
who had done the preliminary cloning work.

Figure 2: PCR Primers. The nucleotide sequences of the
three oligonucleotide primers used in the PCR reactions are
shown. The primers 5-gamma and 3-gdomain were used for the
G-domain construct. The gamma construct was synthesized
with 5-gamma and SeqLEnd. Bases in italics are recognition
sites for restriction enzymes. Bases in bold lettering base
pair with bases in the template DNA.


Nucleotide Sequences of Oligonucleotide Primers

Name         Sequence
5-gamma      5' GACGGTACCmGGGATGACCGTCAGCQIUS 3'
3-gdomain    5' GKCAGATCmGGTSGGCCTTCMGITCCGC 3'
SeqLEnd      5' GCCAAGATCIGCGGGGGTCGCCTCGAOGG 3'


Buffers: Buffer I consisted of 20 mM Tris-HCl, pH 7.8, 50 mM
NH4Cl, 10 mM MgCl2, 6 mM βMe, and 10 % glycerol. Buffer II
contained 20 mM HEPES-KOH, pH 7.6, 50 mM NH4Cl, 0.1 mM EDTA,
6 mM βMe, and 10 % glycerol. Buffer III was identical to
Buffer II except that it contained 300 mM NH4Cl. Buffer IV
contained 20 mM MES-KOH, pH 6.0, 50 mM NH4Cl, 0.1 mM EDTA,
6 mM βMe, and 10 % glycerol. Buffer V was similar to Buffer
IV except that it contained 300 mM NH4Cl.
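
As an illustration of how such a buffer might be made up from concentrated stocks, the sketch below applies C1*V1 = C2*V2 to Buffer II. The final concentrations come from the text; the stock concentrations, the reading of 10 % glycerol as v/v, and the names `STOCKS_MM` and `recipe_ml` are assumptions introduced only for this example. pH adjustment is not modeled.

```python
# Final concentrations for Buffer II as given in the text (mM).
BUFFER_II_MM = {"HEPES-KOH, pH 7.6": 20.0, "NH4Cl": 50.0, "EDTA": 0.1, "bME": 6.0}

# ASSUMPTION: hypothetical stock concentrations (mM); substitute real stocks.
STOCKS_MM = {"HEPES-KOH, pH 7.6": 1000.0, "NH4Cl": 4000.0, "EDTA": 500.0, "bME": 14300.0}


def stock_volume_ml(final_mm, stock_mm, total_ml):
    """Solve C1 * V1 = C2 * V2 for the stock volume V1 (ml)."""
    return final_mm * total_ml / stock_mm


def recipe_ml(buffer_mm, stocks_mm, total_ml, glycerol_percent=10.0):
    """Volumes (ml) of each stock, glycerol (assumed v/v), and water."""
    volumes = {name: stock_volume_ml(conc, stocks_mm[name], total_ml)
               for name, conc in buffer_mm.items()}
    volumes["glycerol"] = glycerol_percent / 100.0 * total_ml
    volumes["water"] = total_ml - sum(volumes.values())
    return volumes


for component, ml in recipe_ml(BUFFER_II_MM, STOCKS_MM, 1000.0).items():
    print(f"{component:20s} {ml:8.2f} ml")
```

Buffer III follows from the same call with the NH4Cl entry changed to 300 mM, for example `dict(BUFFER_II_MM, NH4Cl=300.0)`.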

PCR Reactions: Plasmid DNA for the polymerase chain reaction
(PCR) template was a pBluescript vector, designated T5,
containing part of the IF-2chi cDNA clone from nucleotide 580 to
nucleotide 2466 (Appendix B). PCRs were carried out in 0.1 ml
reaction mixtures with 2.5 units of Taq DNA polymerase, 50
pmol of each oligonucleotide primer (Figure 2), and
approximately 50 fmol of plasmid DNA. The reactions were
initiated after a 5 minute denaturation at 85 °C at a MgCl2
concentration of 1.5 mM. A Perkin Elmer Cetus thermal
cycler was used for the reactions and programmed as shown in
Figure 3. PCR products were analyzed on 1 % agarose gels
stained with ethidium bromide.
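
The amounts quoted above translate into working concentrations, and the program in Figure 3 into a minimum run time, as sketched below. The calculation uses only the values given in the text and in Figure 3; ramp times between temperatures are ignored, so the real run would be longer. The function and variable names are illustrative.

```python
def molar_conc_nM(amount_pmol, volume_ml):
    """Concentration in nM (1 pmol per ml is 1 nM)."""
    return amount_pmol / volume_ml


primer_nM = molar_conc_nM(50.0, 0.1)     # 50 pmol of each primer in 0.1 ml
template_nM = molar_conc_nM(0.050, 0.1)  # 50 fmol = 0.050 pmol of plasmid

# Program from Figure 3 as (minutes per pass, number of passes).
# Steps 3-5 (denature, anneal, extend) are repeated for 30 cycles.
PROGRAM = [(5, 1), (1, 1), (0.75 + 1 + 2.5, 30), (5, 1)]
total_min = sum(minutes * passes for minutes, passes in PROGRAM)

print(f"primer:   {primer_nM:.0f} nM")
print(f"template: {template_nM:.1f} nM")
print(f"primer/template molar ratio: {primer_nM / template_nM:.0f}")
print(f"programmed time, excluding ramps: {total_min} min")
```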

Ligation/Transformation: Aliquots (12 - 20 µl) of the PCR
reaction mixtures were removed and added to 0.3 ml of NaI
solution (GENECLEAN kit). The PCR product was isolated

Figure 3: Program for PCR. The thermal cycler was
programmed as shown for PCR. Steps 3 through 5 were
repeated for 30 cycles. Annealing temperatures for the
G-domain and gamma constructs were 50 and 60 °C, respectively.


Program for PCR Reactions

Step            Time (min)    Temp (°C)
1               5             85
2               1             94
3 (denature)    0.75          94
4 (anneal)      1             50 or 60
5 (extend)      2.5           72
6               5             72
7                             N.A.

using the GENECLEAN method according to the manufacturer's
instructions. DNA (1 µg) was digested with BglII and KpnI,
and the resulting DNA fragments were separated by
electrophoresis on a 1 % agarose gel in 40 mM Tris acetate
with 10 mM EDTA (TAE). Bands containing the desired DNA
were excised from the gel and purified with GENECLEAN.

A pQE-52/17 Type III expression construct (10) was
kindly provided by Qiong Lin. It contained one of her
inserts cloned between the polylinker (located on the pQE-52
portion of the plasmid) and the BglII site (located on the
pQE-17 portion of the plasmid) (10). The plasmid was
digested with BglII and KpnI (there is a KpnI recognition
site in the polylinker), and the DNA fragments were
separated by electrophoresis on a 1 % agarose gel in TAE.
The vector fragment was excised and eluted using GENECLEAN.
For the ligation reaction, a 10 µl reaction mixture was
prepared containing about 1 pmol of insert and 0.5 pmol
of vector. Incubation was carried out with 4 units of T4
DNA ligase and 1 mM ATP at 16 °C for 14 hours.
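
The ligation above is specified in molar amounts (about 1 pmol of insert to 0.5 pmol of vector, a 2:1 ratio). A minimal sketch of the conversion to mass follows. It assumes an average of 650 g/mol per base pair of double-stranded DNA, a common approximation that is not stated in this report, and uses the 780 and 1762 base pair PCR product lengths reported in Results as stand-ins for the digested inserts, which are a few base pairs shorter.

```python
# ASSUMPTION: average mass of one base pair of double-stranded DNA, g/mol.
AVG_BP_MASS = 650.0


def pmol_to_ng(pmol, length_bp):
    """Mass in ng of `pmol` picomoles of double-stranded DNA of `length_bp`."""
    # pmol * (g/mol) gives picograms; divide by 1000 for nanograms.
    return pmol * length_bp * AVG_BP_MASS / 1000.0


for name, length_bp in [("G-domain insert", 780), ("gamma insert", 1762)]:
    print(f"1 pmol of {name} ({length_bp} bp) is about "
          f"{pmol_to_ng(1.0, length_bp):.0f} ng")
```

The same function gives the vector mass once the length of the digested vector fragment is known; that length is not given in the text.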

E. coli M15 carrying the pREP4 plasmid was transformed
by high-efficiency electro-transformation on a Bio-Rad Gene
Pulser with 3 µl of the ligation reaction according to the
manufacturer's instructions. After incubation in SOC medium
(LB with 2.5 mM KCl, 10 mM MgSO4, and 20 mM glucose) at 37
°C for 1 hour, the cell suspension was plated on LB-agar
containing 100 µg/ml ampicillin and 25 µg/ml kanamycin.
Colonies were picked and plated on a master plate. Colonies
were screened for plasmids of interest (based upon the size
of the insert), and frozen cell stocks were made of those
strains carrying plasmids of interest. Plasmids were
isolated and sequenced by the Sanger dideoxy method (11) to
ensure that the correct reading frame was maintained in the
construct.

Small-Scale Expression: A time course for the expression of
the various constructs was carried out essentially as described by
QIAGEN (10) using 1 mM IPTG for induction. A 20 ml culture
was used, and 1.5 ml of cells was removed at 0.25, 0.5,
1, 2, 3, and 4 hours for harvesting and freezing. The
samples were analyzed by 12 % SDS-PAGE.

Purification of the G-Domain: One liter of E. coli M15
carrying the construct of interest was grown as described
(10) and induced with 1 mM IPTG. Cells were harvested after
4 hours of induction and collected by centrifugation in a
Sorvall H6000A rotor at 4500 rpm and 4 °C. The cell pellet
was weighed, ground with 2 times the cell weight of alumina
A305, and washed into centrifuge tubes with 2-3 volumes of
Buffer I. The lysate was subjected to centrifugation at
10,000 rpm at 4 °C in a Sorvall SS-34 rotor for 10 minutes.
The supernatant (S30) was transferred to a fresh tube and
subjected to centrifugation at 15,000 rpm for 30 minutes in
a Sorvall SS-34 rotor at 4 °C.
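
Rotor speeds can be converted to relative centrifugal force to make the spins comparable across rotors. The sketch below uses the standard relation RCF = 1.118e-5 x r(cm) x rpm^2. The maximum radius of 10.7 cm for the SS-34 rotor is an assumption for illustration and should be checked against the rotor documentation; the report itself gives only rpm values.

```python
def rcf(rpm, radius_cm):
    """Relative centrifugal force (multiples of g) from speed and radius."""
    return 1.118e-5 * radius_cm * rpm ** 2


# ASSUMPTION: maximum radius of the SS-34 rotor in cm (not given in the text).
SS34_RMAX_CM = 10.7

for rpm in (10_000, 15_000):
    print(f"{rpm} rpm in SS-34: about {rcf(rpm, SS34_RMAX_CM):,.0f} x g at r_max")
```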


The S30 was stirred for 1 hour at 4 °C with 4 ml of Ni-NTA resin
previously equilibrated with Buffer I.
The slurry was poured into a column and washed with Buffer I
until the A280 decreased to baseline. The column was then
washed with Buffer I which had been adjusted to pH 6.0 by
addition of HCl until the A280 decreased to baseline.
Protein retained by the resin was eluted using a 20 ml
linear gradient from 0 to 0.5 M imidazole in Buffer I.
Fractions of 2 ml were collected at a flow rate of 0.5
ml/min. Fractions were analyzed by SDS-PAGE, and those
containing the G-domain protein were pooled.

Pooled fractions were dialyzed against Buffer II and
applied to a TSKgel DEAE-5PW HPLC column previously
equilibrated with Buffer II. The column was washed at 1
ml/min until the A280 returned to baseline. The column was
developed with a linear gradient (30 ml) of 0 to 100 %
Buffer III at a flow rate of 0.5 ml/min while collecting 1
ml fractions. Fractions were analyzed and pooled as
described above. The pooled fractions were buffer-exchanged
into Buffer II using a Centricon 10 concentrator. The
protein was again applied to the same column and washed as
described. The column was developed with the profile shown
in Figure 4, and 1.5 minute fractions were collected.
Fractions were analyzed and pooled as described previously.
The pooled fractions were dialyzed against Buffer IV and

Figure 4: Elution Gradient Profile for 2nd DEAE HPLC Column.

applied to a TSKgel SP-5PW HPLC column previously
equilibrated with Buffer IV at a flow rate of 1 ml/min. The
column was washed with Buffer IV until the A280 reached
baseline. The protein was eluted with a linear gradient (60
ml) of 0 to 100 % Buffer V, and 0.5 ml fractions were
collected. Fractions were analyzed by SDS-PAGE as described
above. Fractions were fast-frozen at -70 °C using a
2-propanol/dry ice bath between each of the above steps.
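
The linear gradients used in these steps can be related to eluent composition with a short calculation. The sketch below takes the gradient volumes and end-point concentrations from the text (Buffer II to Buffer III is 50 to 300 mM NH4Cl over 30 ml; the imidazole gradient is 0 to 500 mM over 20 ml) and assumes an ideal gradient with no delay volume between the mixer and the column outlet, which a real HPLC system will not match exactly.

```python
def gradient_conc(volume_ml, gradient_ml, start_mM, end_mM):
    """Concentration (mM) delivered after `volume_ml` of a linear gradient.

    ASSUMPTION: ideal mixing and no dead volume in column or tubing.
    """
    if not 0 <= volume_ml <= gradient_ml:
        raise ValueError("volume lies outside the gradient")
    return start_mM + (end_mM - start_mM) * volume_ml / gradient_ml


# Halfway through the first DEAE-5PW gradient (Buffer II -> Buffer III).
print(gradient_conc(15, 30, 50, 300), "mM NH4Cl")

# Halfway through the Ni-NTA imidazole gradient (0 -> 0.5 M over 20 ml).
print(gradient_conc(10, 20, 0, 500), "mM imidazole")
```

The same function applies to the 60 ml Buffer IV to Buffer V gradient on the SP-5PW column, which spans the same NH4Cl range.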

Miscellaneous procedures: The Laemmli procedure for
SDS-PAGE was followed using 12 % polyacrylamide resolving
gels (11). Gels were stained with Coomassie Brilliant Blue.
Bradford assays were carried out according to the reagent
manufacturer's instructions with bovine serum albumin as the
standard.

RESULTS


Construction of Expression Plasmids: As mentioned, the
pQE-52/17 expression vector had already been constructed in our
laboratory. This expression system provides a high level of
expression and incorporates a six-residue histidine tag on the
C-terminal end of the polypeptide to aid in purification
(10). To simplify the construction of the
expression plasmids, PCR primers were designed to meet three
criteria. First, they had to yield a product that could be
ligated into pQE-52/17 while maintaining the proper reading
frame. Second, primers from the 5' end of the IF-2chi
sequence had to have a KpnI site engineered on the 5' end of
the primer. Similarly, primers from the 3' end of the IF-2chi
sequence had to have a BglII site engineered on the 5' end
of the primer (Figure 2). Third, the primers had to be
designed such that they would provide products coding for
the IF-2chi G-domain and IF-2chi gamma. The resulting primers are
shown in Figure 2 and functioned properly, as described
below.

The G-domain insert was made using the 5-gamma and
3-gdomain primers. The 5-gamma primer defines the 5' end of
both constructs; in an alignment based upon homology of E.
coli IF-2 and IF-2chi, the primer corresponds to
the IF-2chi sequence that is homologous to the start of E.
coli IF-2 gamma. The 3-gdomain primer corresponds to the 3'
end of the guanine nucleotide binding domain. The IF-2chi gamma
insert was made using the 5-gamma and SeqLEnd primers. SeqLEnd
base pairs with the region on the 5' side of the stop codon in
the 2466 nucleotide open reading frame for IF-2chi.

The polymerase chain reaction for the G-domain produced
approximately 8-10 µg of the expected 780 base pair product
with no evidence of side reactions (Figure 5). Amplifying
the DNA necessary for the IF-2chi gamma product proved more
difficult. Attempts were made at several annealing
temperatures from 50 to 65 °C before the results shown were
obtained (Figure 6). The reaction produced 8-10 µg of the
expected 1762 base pair product, and there was no evidence
of spurious products (Figure 6).

Confirmation of the Sequences of the Expression Constructs:
The plasmids containing the G-domain and gamma constructs
were sequenced as described in Materials and Methods
(results not shown), using custom-designed oligonucleotide
primers originally used to sequence the cDNA for IF-2chi (9).
The sequences of the inserts matched prior sequences
(Appendix B), and the ligated regions were in-frame with the
start site provided by the pQE vector (10). Complete
sequences of the expressed proteins, based on the DNA
sequence, are shown in Figure 7 and Figure 8. The sequences
represent the entire expressed protein, including any

Figure 5: Electrophoretic Analysis of PCR Amplification of
G-domain Sequences. The photograph of an ethidium bromide
stained agarose gel shows the results of the PCR used to
produce the DNA for the G-domain construct. Lane 1 - 200 ng
of high molecular weight standards (lambda DNA digested with
HindIII). Lane 2 - 200 ng of low molecular weight standards
(pEC1009 digested with Sau3A). Lane 3 - Lane left blank.
Lane 4 - A 10 µl aliquot taken from the PCR reaction tube
for the G-domain construct. The PCR reaction was carried
out as described in Materials and Methods.

Figure 6: Electrophoretic Analysis of PCR Amplification of
IF-2chi Gamma Sequences. The photograph of an ethidium
bromide stained agarose gel shows the results of the PCR used to
produce the DNA for the gamma construct. Lane 1 - 200 ng of
high molecular weight markers (lambda DNA digested with
HindIII). Lane 2 - 200 ng of low molecular weight markers
(pEZClOOS digested with Sau3A). Lane 3 - A 10 µl aliquot
taken from PCR reaction tube 1 for the gamma construct.
Lane 4 - A 10 µl aliquot taken from PCR reaction tube 2 for
the gamma construct. The PCR reaction was carried out as
described in Materials and Methods.


Figure 7: Sequence of G-Domain Protein. The amino acid
sequence for the entire G-domain protein as it is expressed
is depicted in single letter code. Residues coded by IF-2chi
DNA are underlined.


1  MRIRMAARYL  GMTVSEIAGK  LAITPANWT  VLFKKGIMSA  PSQTIAYDLV 
51  KIVCDEYKVE  VLEVEEEDGI  ASMEDRFVLD  EEAEALVSRP  PWTIMGHVD 
101  HGKTSLLDYI  RKSNWAGEA  SGITOAIGAY  HVEFASPTDG  TPTFISFIDT 
151  PGHEAFTAMR  ARGATVTDIT  IIWAADDGV  RPQTKEAIAH  CKAAGVPMW 
201  AINKIDKDGA  DPERVMNELA  OAGLVPEEWG  GEVPTVKISA  KKGLGIKELL 


251  EMILLTAEVA  DLKANLDLRS  HHHHHH* 


Figure 8: Sequence of IF-2chi Gamma Protein. The amino acid
sequence for the entire IF-2chi gamma protein as it is
expressed is depicted in single letter code. Residues coded
by IF-2chi DNA are underlined.


1  MRIPMRRRYL  GMTVSEIAGK  LAITPANWT  VLFKKGIMSA  PSQTIAYDLV 
51  KIVCDEYKVE  VLEVEEEDGI  ASMEDRFVLD  EEAEALVSRP  PWTIMGHVD 
101  HGKTSLLDYI  RKSNWAGEA  SGITOAIGAY  HVEFASPTDG  TPTFISFIDT 
151  PGHEAFTAMR  ARGATVTDIT  IIWAADDGV  RPOTKEAIAH  CKAAGVPMW 
201  AINKIDKDGA  DPERVMNELA  OAGLVPEEWG  GEVPTVKISA  KKGLGIKELL 
251  EMILLTAEVA  DLKANPAAPA  EGTVIEAYLD  RTRGPVATVL  VQNGTLRAGD 
301  VWTNATWGR  VRAIMDEKGA  MLEAAPPSLP  VOVLGLDDVP  AAGDKFEVYA 
351  SEKEARDKVD  EFERTKKEKN  WASLASRDLV  RLDNNADGKG  LEVMNVILKT 
401  DVSGSCEAIR  AALDTLPOTK  lELRLILASP  GDITVSDVNL  AASTGSIILG 
451  FNVDTFSAAE  ALIKNLGIKC  MTFDVIYDLV  DOMKAVMEGK  LGDEQIPEKA 
501  GEAEVKAVFA  ARNGKKAAGC  LWAGRLVAP  AFIEVLRKKK  ILFSGQLFOL 
551  RRMKDNVSEV  GTDTECGVTL  DDFDDWQEGD  RIVCYSTVTR  ORALEATPAD 


601  LRSHHHHHH* 


residues  which  were  added  by  the  pQE  vector,  such  as  the 
start  codon,  residues  encoded  in  the  polylinker,  and  the  6X 
histidine  tag.  The  calculated  molecular  masses  for  these 
polypeptides  are  29.7  kilodaltons  (kDa)  for  the  G-domain  and 
65.7  kDa  for  the  gamma  protein. 
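
A calculated molecular mass of this kind can be obtained by summing average residue masses over the single-letter sequence and adding one water for the free termini. The sketch below shows one way to do this; the residue masses are standard average values, the function name `protein_mass_da` is introduced here for illustration, and this is not necessarily the procedure used to obtain the figures quoted above. The sequences in Figures 7 and 8 would be passed in as strings.

```python
# Average residue masses in Da (amino acid minus one water), standard values.
RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886, "C": 103.1388,
    "E": 129.1155, "Q": 128.1307, "G": 57.0519, "H": 137.1411, "I": 113.1594,
    "L": 113.1594, "K": 128.1741, "M": 131.1926, "F": 147.1766, "P": 97.1167,
    "S": 87.0782, "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER = 18.015  # one water for the free N- and C-termini


def protein_mass_da(sequence):
    """Average molecular mass (Da) of an unmodified polypeptide.

    Spaces and a trailing '*' (stop marker) are ignored; any other
    unrecognized character raises KeyError.
    """
    residues = sequence.replace(" ", "").rstrip("*").upper()
    return sum(RESIDUE_MASS[aa] for aa in residues) + WATER


# Contribution of the six-histidine tag on its own, as a small example.
print(f"HHHHHH: {protein_mass_da('HHHHHH*'):.1f} Da")
```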

Growth Rates of Strains Containing Expression Plasmids: The
growth rates of E. coli carrying the expression constructs
were monitored during small-scale expression experiments and
provided interesting preliminary results. Expression of the G-domain
protein had no effect on the growth rate of E. coli M15, as
shown in Figure 9. In contrast, the growth rate of cells
expressing the IF-2chi gamma protein was drastically reduced
(Figure 10).
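
Growth curves such as those in Figures 9 and 10 are often summarized as a doubling time over the exponential phase. The sketch below shows the calculation from two absorbance readings at 600 nm. The readings in the example are hypothetical placeholders, not data from this work, and the formula holds only while growth is exponential.

```python
import math


def doubling_time(od_start, od_end, elapsed_min):
    """Doubling time (min) from two A600 readings taken `elapsed_min` apart.

    Valid only for exponential growth between the two readings.
    """
    if od_start <= 0 or od_end <= od_start:
        raise ValueError("readings must be positive and increasing")
    return elapsed_min * math.log(2) / math.log(od_end / od_start)


# HYPOTHETICAL readings for illustration only: A600 of 0.1 rising to 0.4 in 60 min.
print(f"doubling time: {doubling_time(0.1, 0.4, 60):.0f} min")
```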

Small-Scale Expression of the G-domain Protein: Lysis under
denaturing conditions followed by batch purification on Ni-NTA
resin separated a single
protein from the cellular lysate (Figure 11). Only one
protein bound the Ni-NTA resin, and that protein was not
present prior to induction with IPTG at time = 0. The
concentration of the protein slowly increased over time
after induction (Figure 11). The apparent size of the
protein from SDS-PAGE is about 38 kDa, which is larger than but
reasonably close to the calculated molecular mass of 29.7 kDa. Also,
Qiong Lin, of this laboratory, observed smaller polypeptides




Figure 9: Growth Curve for E. coli M15 Carrying the G-Domain
Construct. The plot illustrates the growth rate of E. coli
M15 containing the G-domain construct, either induced with 1
mM IPTG or uninduced.


Absorbance  (600  nm) 


Figure 10: Growth Curve for E. coli M15 Carrying the Gamma
Construct. The plot illustrates the growth rate of E. coli
M15 containing the gamma construct, either induced with 1 mM
IPTG or uninduced.

Figure 11: SDS-PAGE Analysis of the Small-Scale Expression of
the G-Domain Protein. Lane 1 - Bio-Rad pre-stained
molecular weight standards. Lane 2 - Protein bound by
Ni-NTA resin at time = 0 (prior to induction). Lanes 3-6 -
Protein bound by Ni-NTA resin at time = 1, 2, 3, and 4
hours, respectively, after induction. Lanes 7-10 - Unbound
protein at time = 0, 1, 3, and 4 hours, respectively.

with the 6X His tag migrating slower than expected in SDS-PAGE
experiments (personal communication).

Small-Scale Expression of the IF-2chi Gamma Protein: Analysis
of the Ni-NTA bound fraction from the lysate of the
small-scale expression of the gamma protein revealed a single
protein with an apparent molecular mass of 65 kDa (Figure
12). This protein was not present at time = 0, prior to
induction. A faint band can be seen as early as 15 minutes
after induction (Figure 12). Unlike the G-domain protein,
the relative concentration of IF-2chi gamma decreased after 1
hour instead of steadily increasing (Figure 12). Analysis
of the unbound proteins before and after induction shows the
appearance of a 65 kDa protein after induction (Figure 12).

Purification  of  the  G-Domain:  The  expression  experiments 
described  above  were  carried  out  on  cell  extracts  prepared 
under  denaturing  conditions.  In  order  to  obtain  the  IF-2chi 
G-domain  in  a  native  conformation,  a  purification  scheme  was 
developed.  This  four-step  process  started  with  thousands  of 
contaminants  and  resulted  in  a  preparation  estimated  to  be 
85-90  %  pure. 

The first step in the purification procedure, using the
Ni-NTA affinity column, removed many of the cellular
proteins, but the G-domain preparation was clearly not pure

Figure 12: SDS-PAGE Analysis of the Small-Scale Expression
of IF-2chi Gamma. Lane 1 - Bio-Rad pre-stained molecular
weight standards. Lane 2 - Proteins bound to Ni-NTA resin
at time = 0 (prior to induction). Lanes 3-8 - Protein bound
to Ni-NTA resin at times = 0.25, 0.5, 1, 2, 3, and 4 hours,
respectively, after induction. Lane 9 - Unbound proteins at
time = 0. Lane 10 - Unbound proteins at time = 0.5 hours.


(Figure 13). In Lane 2 (Figure 13) the large band at about
45 kDa is EF-Tu, and the somewhat bright band below it at
38-40 kDa is the G-domain. As is evident from the gel, the
G-domain is greatly enriched in the eluted fractions (Lanes
5-8), and almost all of the EF-Tu, one of the most abundant
proteins in E. coli, was separated away (Figure 13).
However, hundreds of contaminants still remained after the
Ni-NTA column (Figures 13 and 14). A majority of the
G-domain eluted in the first 10 ml of the 20 ml gradient,
and only small amounts could be seen after fraction 7
(Figures 13 and 14).

The  first  TSKgel  DEAE-5PW  HPLC  column  provided  much 
better  separation  and  approximately  half  of  the  contaminants 
present  after  the  Ni-NTA  column  were  removed  (Figure  15) . 
Retention  of  the  G-domain  by  the  DEAE-5PW  column  was 
excellent.  About  halfway  through  the  gradient  there  was  a 
strong  absorbance  at  280  nm  (results  not  shown)  which 
corresponded  with  the  appearance  of  the  G-domain  on 
polyacrylamide  gels  (Figure  15) .  The  first  strong  band  for 
the  G-domain  (Figure  15,  Lane  11)  was  in  fraction  26  at  an 
approximate NH4Cl concentration of 175 mM.

The second DEAE-5PW HPLC column produced even better
results than the first (Figures 16 and 17). The drawn-out
gradient and excellent retention produced very good
separation, leaving the G-domain as the major protein
constituent in 4 fractions with as few as six unwanted
proteins present in lower amounts (Figures 16 and 17).

Figure 13: SDS-PAGE Analysis of Ni-NTA Column Fractions for
the G-Domain. Lane 1 - Bio Rad pre-stained molecular
weight standards. Lane 2 - Flow-thru from the Ni-NTA
column. Lane 3 - Wash with pH 6.0 Buffer I. Lanes 4-10 -
Imidazole eluted fractions 3-9, respectively. (The intense
band with an apparent molecular mass of 38 kDa in lanes 3-8
is the G-domain protein.)

Figure 14: SDS-PAGE Analysis of Additional Ni-NTA Column
Fractions for the G-Domain. Lane 1 - Bio Rad pre-stained
molecular weight standards. Lanes 2-10 - Eluted fractions
11-18, respectively. Analysis of earlier column fractions
is shown in Figure 13. (The G-domain protein is the protein
with an apparent molecular mass of 38 kDa in lanes 2-8.)

Figure 15: SDS-PAGE Analysis of Fractions from First TSKgel
DEAE-5PW HPLC Column for the G-Domain. Lane 1 - Bio Rad
pre-stained molecular weight standards. Lanes 2-10 - Column
fractions 29, 30, 31, 32, 34, 36, 37, 39, and 43,
respectively. Lanes 11-13 - Column fractions 26-28.

Figure 16: SDS-PAGE Analysis of Fractions from Second TSKgel
DEAE-5PW HPLC Column for the G-Domain. Lane 1 - Bio Rad
pre-stained molecular weight standards. Lanes 2-10 -
Fractions 74-82.

Figure 17: SDS-PAGE Analysis of Fractions from Second TSKgel
DEAE-5PW HPLC Column for the G-Domain. Lane 1 - Bio Rad
pre-stained molecular weight standards. Lanes 2-10 -
Fractions 83-91. Analysis of earlier fractions in the
gradient is shown in Figure 16.

The TSKgel SP-5PW HPLC column provided little
additional purification of the G-domain. Very little of the
protein was retained by the column, as indicated by the
absorbance monitor and SDS-PAGE analysis of the flow-thru
and fractions (results not shown). In fact, the bands on
the gel were too faint to photograph. The G-domain that was
retained on the column appeared slightly more pure, with
only four other proteins present. However, so little of the
G-domain protein remained that further purification of this
particular preparation was abandoned.


DISCUSSION


Two  truncated  forms  of  IF-2chi  have  been  engineered  at 
the  DNA  level  and  expressed  in  E.  coli.  These  proteins, 
designated the G-domain and IF-2chi gamma based upon similar
work  done  with  E.  coli  IF-2  (4,5),  were  expressed  to 
different  levels  in  E.  coli. 

Growth curves and relative amounts of protein on
polyacrylamide gels indicated that the G-domain was produced
and tolerated extremely well by E. coli. In contrast,
IF-2chi gamma greatly retarded growth of E. coli, and its
relative amount in the cells peaked at about one hour. A
possible explanation for this phenomenon is that the gamma
form possesses some IF-2 activity (either binding initiator
tRNA or interacting with the ribosome) which is toxic to
E. coli. Based on work in E. coli, the G-domain would not be
expected to have any function except weak binding of GDP (4),
which would have a limited effect on cell growth.

These two proteins were shown to have the 6X His tag,
which is part of the expression vector, based on sequencing
of the DNA and their affinity for Ni-NTA resin under
denaturing conditions. Both proteins were bound by the
Ni-NTA resin with high selectivity under denaturing
conditions, probably due to the accessibility of the 6X His
tag. However, under native conditions the G-domain was not
bound as selectively.

Before the purification of the G-domain can be analyzed,
the need for its purity must be discussed. The only IF-2
type activity found for the E. coli IF-2 G-domain was GDP
binding that was much weaker than that of native IF-2 (4).
Therefore, any trace contamination by E. coli IF-2 would
drastically affect any binding assays attempted with the
IF-2chi G-domain. Due to this complication, the
purification scheme developed had to be carried out without
an assay for the factor, and relied only on the physical
monitoring of the protein.

With the native G-domain protein, the Ni-NTA affinity
column did not come close to the separation achieved with
the denatured form. There are several possible explanations
for this phenomenon. As noted above, the 6X His tag is
probably more accessible in the denatured form, so the
native form is bound less selectively. Also, the wash was
ineffective due to the buffer system, as illustrated by
SDS-PAGE. Finally, some endogenous E. coli proteins may have
adjacent histidines in their native state which are bound by
the resin.

The TSKgel DEAE-5PW HPLC column provided excellent
separation, except for those few proteins which were so
similar electrostatically that they co-eluted. The use of
the TSKgel SP-5PW HPLC column at pH 6.0 should have
circumvented this similarity, based on a normal pKa for
histidine between 6.0 and 7.0. However, later computer
analysis of the entire sequence gave a calculated pKa of
5.29.

Based on experience gained in this investigation, the
following changes to this scheme in future experiments would
probably yield pure G-domain. Growth of a larger cell
culture and a more stringent wash of the Ni-NTA affinity
column would enhance the first step. The first DEAE HPLC
column should be omitted and, assuming no solubility
problems, the pH of the protein solution lowered to 5.0
before applying it to the TSKgel SP-5PW HPLC column. These
changes should enhance the purity.
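
The pH reasoning here can be made concrete with the Henderson-Hasselbalch relation, which gives the fraction of a titratable group that is protonated at a given pH. The sketch below is illustrative only: the function name is hypothetical, and applying a single-group formula to the calculated value of 5.29 reported for the whole sequence is a simplifying assumption, not part of the original analysis.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Fraction of a titratable group in its protonated form.

    Derived from Henderson-Hasselbalch: pH = pKa + log10([A-]/[HA]),
    so [HA] / ([HA] + [A-]) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    # pKa 6.0 and 7.0 bracket the "normal" histidine range quoted in the text;
    # 5.29 is the calculated value reported for the entire sequence
    # (treated here, as an assumption, like a single titrating group).
    for pka in (6.0, 7.0, 5.29):
        for ph in (6.0, 5.0):
            frac = fraction_protonated(ph, pka)
            print(f"pKa {pka:.2f}, pH {ph:.1f}: {frac:.2f} protonated")
```

Under these assumptions, a group with a pKa of 6.0 to 7.0 is half or more protonated at pH 6.0, whereas a pKa of 5.29 leaves only a minority protonated at pH 6.0 and a majority at pH 5.0, which is consistent with the suggestion to lower the pH before the SP-5PW column.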

The expression and purification of the G-domain and
gamma form of IF-2chi described above lay the foundation for
a host of future experiments. The G-domain can be assayed
for its ability to bind GDP, as well as its ability to
hydrolyze GTP. When the entire sequence of IF-2chi is
cloned, a full-length expression construct as well as
truncated forms corresponding in size to E. coli IF-2 alpha
and beta can be designed and expressed. In vitro assays of
IF-2chi activity can be performed with all the proteins to
determine if IF-2chi is active as a monomer, and if its
relationship to E. coli IF-2 extends to its structure and
function (i.e. is IF-2chi gamma active like E. coli IF-2
gamma?).

The design, expression, and purification of two
truncated forms of IF-2chi is the first small step in
understanding how the structure of chloroplast initiation
factor 2 from E. gracilis dictates its role in translational
initiation.


APPENDIX A

Sequence Comparison Of IF-2 From E. coli And Chloroplasts
Of E. gracilis


APPENDIX B

Known DNA And Translated Amino Acid Sequence Of IF-2chi


TTCCAGTCCTCTGGCAGCCCTATCAAGCCCCGCATCAACCTTGACCGCCCCTCCACCTCC 

1 - + - + - + - + - + - +  60 

FQSSGSPIKPRINLDRPSTS 

ACCCCAACGCCCCCGGAAGCACCGACCAGCCCCTCAGCCCGCCAGCCGGTGACGCAAGTG 

61 - + - + - + - + - + - +  120 

TPTPPEAPTSPSARQPVTQV

CCCCAGGCGAACAGCGTCCCTGCTGGAGCGGTAGCATCTCAAGCAGAAGTTAAGAAGCCG 

121 - + - + - + - + - + - +  180 

PQANSVPAGAVASQAEVKKP 

GCAGACCCCCAGCCCCCGGCCACCCCCTCCGCGCCGGTGCTGCGGCGCCCCGTCCGCACC 

181 - + - + - + - + - + - +  240 

ADPQPPATPSAPVLRRPVRT 

GCCATGCCGGCCTCGCCGCCCCGGATGGTGATCAACCTCGACGACATCCCCGACCGCTCC 

241  - + - + - + - + - + - +  300 

AMPASPPRMVINLDDIPDRS 

AAGCCGGTGTGGCCCGCCCCCCCGCCCCGAGCAAAGGGGCAAGGGGGCGGCAAGGGCGGC 

301  - + - + - + - + - + - +  360 

KPVWPAPPPRAKGQGGGKGG 

AAAGGCGGCAAAGGCGGCAAAGGCGGCAAAGGTGGCAAGGGGGACCGCGAGCAGCCGGCG 

361  - + - + - + - + - + - +  420 

KGGKGGKGGKGGKGDREQPA 

GTCGTGCGGCGGGCAAAACCACGGAGGACGGCAAGCACAGCCGAAGGGCCCGCCGCGGAG 

421  - + - + - + - + - + - +  480 

VVRRAKPRRTASTAEGPAAE 

TCCAAAGAAAGCGGAGGACGCGAAGCCCAAATTTGGGTGACGCCCAAGGGTGGAAAGGGC 

481  - + - + - + - + - + - +  540 

SKESGGREAQIWVTPKGGKG 

CGTGACAAGTGGAAGAAAGGAAAGGAGGAAGTCGACAAGAGCGAGGCGCTGCTGTTGAAA 

541  - + - + - + - + - + - +  600 

RDKWKKGKEEVDKSEALLLK 

GCCCGAAAGAAGACGCGCCTGGAGCGAAAGGAACGCCGCGAGGAGGTGCGGGAGGCGAAC 

601  - + - + - + - + - + - +  660 

ARKKTRLERKERREEVREAN 

GCCGCCAAGAAGGAAGAGATTATCGAAGTTGGGCCGCAGGGGATGACCGTCAGCGAGATT 

661 - + - + - + - + - + - +  720

AAKKEEIIEVGPQGMTVSEI 

GCCGGCAAACTCGCCATCACCCCTGCCAACGTGGTGACAGTCCTCTTCAAGAAGGGCATC 

721  - + - + - + - + - + - +  780 

AGKLAITPANVVTVLFKKGI

ATGAGTGCCCCGAGCCAGACGATTGCGTACGACCTCGTCAAGATTGTGTGCGATGAGTAC

781 - + - + - + - + - + - +  840

MSAPSQTIAYDLVKIVCDEY

AAAGTGGAAGTTCTTGAAGTTGAGGAAGAAGACGGGATCGCGTCCATGGAGGACCGCTTC

841 - + - + - + - + - + - +  900

KVEVLEVEEEDGIASMEDRF

GTGCTGGACGAGGAGGCCGAAGCGTTGGTGTCCCGCCCGCCGGTGGTGACCATCATGGGC

901 - + - + - + - + - + - +  960

VLDEEAEALVSRPPVVTIMG

CACGTGGACCACGGCAAGACCTCGCTGCTGGACTACATCCGCAAGTCCAACGTTGTCGCC

961 - + - + - + - + - + - +  1020

HVDHGKTSLLDYIRKSNVVA

GGGGAGGCGAGCGGCATCACCCAGGCCATTGGCGCGTACCACGTCGAATTCGCATCTCCC

1021 - + - + - + - + - + - +  1080

GEASGITQAIGAYHVEFASP

ACGGACGGCACCCCGACATTCATCTCATTCATCGACACGCCGGGGCACGAGGCCTTCACG

1081 - + - + - + - + - + - +  1140

TDGTPTFISFIDTPGHEAFT

GCGATGCGGGCCCGCGGGGCTACCGTGACGGATATCACCATCATCGTGGTGGCAGCAGAC

1141 - + - + - + - + - + - +  1200

AMRARGATVTDITIIVVAAD

GACGGGGTGCGGCCCCAGACCAAGGAGGCTATTGCTCACTGCAAGGCCGCTGGGGTGCCA

1201 - + - + - + - + - + - +  1260

DGVRPQTKEAIAHCKAAGVP

ATGGTGGTGGCGATCAACAAGATCGACAAGGACGGCGCGGACCCGGAGCGGGTGATGAAC

1261 - + - + - + - + - + - +  1320

MVVAINKIDKDGADPERVMN

GAGCTGGCGCAGGCGGGGCTGGTGCCGGAGGAGTGGGGCGGCGAGGTGCCGACGGTGAAG

1321 - + - + - + - + - + - +  1380

ELAQAGLVPEEWGGEVPTVK

ATCAGCGCCAAGAAGGGCCTCGGCATCAAGGAGCTGCTGGAGATGATCCTCCTCACCGCG

1381 - + - + - + - + - + - +  1440

ISAKKGLGIKELLEMILLTA

GAGGTGGCGGACCTGAAGGCCAACCCCGCGGCCCCCGCGGAGGGCACTGTCATTGAGGCA

1441 - + - + - + - + - + - +  1500

EVADLKANPAAPAEGTVIEA

TATTTGGACCGAACACGCGGGCCGGTTGCGACGGTGCTCGTCCAAAACGGCACTCTGCGG

1501 - + - + - + - + - + - +  1560

YLDRTRGPVATVLVQNGTLR


GCGGGCGACGTGGTAGTCACCAACGCTACCTGGGGCCGGGTGCGGGCCATCATGGACGAG 

1561 - + - + - + - + - + - +  1620

AGDVVVTNATWGRVRAIMDE 

AAGGGGGCAATGCTGGAGGCTGCGCCCCCGTCGCTGCCCGTCCAAGTGCTCGGCCTGGAC 

1621 - + - + - + - + - + - +  1680

KGAMLEAAPPSLPVQVLGLD 

GACGTCCCAGCCGCTGGGGACAAGTTTGAGGTCTACGCGTCGGAGAAGGAGGCGAGGGAC 

1681 - + - + - + - + - + - +  1740

DVPAAGDKFEVYASEKEARD 

AAGGTGGACGAGTTTGAGCGGACCAAGAAGGAGAAGAATTGGGCGTCGCTGGCGTCCCGG 

1741  - + - + - + - + - + - +  1800 

KVDEFERTKKEKNWASLASR 

GACTTGGTGCGGCTGGACAACAACGCGGATGGGAAGGGGTTGGAGGTGATGAACGTCATC 

1801  - + - + - + - + - + - +  1860 

DLVRLDNNADGKGLEVMNVI 

CTCAAGACCGACGTCTCCGGGTCCTGCGAGGCCATCCGGGCGGCGCTGGACACCCTGCCC 

1861  - + - + - + - + - + - +  1920 

LKTDVSGSCEAIRAALDTLP 

CAGACCAAGATCGAGCTGCGCCTGATCCTGGCCTCCCCGGGGGACATCACCGTCTCCGAT 

1921  - + - + - + - + - + - +  1980 

QTKIELRLILASPGDITVSD 

GTCAACCTTGCTGCTTCCACGGGTAGCATCATCCTGGGCTTCAACGTGGACACGTTTTCT 

1981  - + - + - + - + - + - +  2040 

VNLAASTGSIILGFNVDTFS 

GCCGCGGAGGCGCTCATCAAGAACCTTGGCATCAAGTGCATGACGTTTGATGTCATCTAC 

2041 - + - + - + - + - + - +  2100

AAEALIKNLGIKCMTFDVIY 

GACCTCGTGGACCAGATGAAGGCAGTAATGGAAGGCAAGTTGGGCGACGAGCAGATCCCG 

2101 - + - + - + - + - + - +  2160 

DLVDQMKAVMEGKLGDEQIP 

GAGAAGGCCGGGGAGGCGGAGGTGAAGGCGGTGTTCGCGGCGCGGAACGGGAAGAAGGCA 

2161  - + - + - + - + - + - +  2220 

EKAGEAEVKAVFAARNGKKA 

GCCGGGTGCCTGGTGGTCGCGGGCCGCCTGGTCGCCCCCGCGTTCATCGAGGTGCTGCGG 

2221  - + - + - + - + - + - +  2280 

AGCLVVAGRLVAPAFIEVLR 

AAGAAGAAGATCCTGTTTTCGGGGCAGCTGTTCCAACTGCGGCGGATGAAGGACAATGTC 

2281  - + - + - + - + - + - +  2340 

KKKILFSGQLFQLRRMKDNV 


AGCGAGGTGGGCACCGACACCGAGTGCGGTGTCACCCTTGACGACTTCGACGACTGGCAG

2341 - + - + - + - + - + - +  2400

SEVGTDTECGVTLDDFDDWQ 

GAGGGCGACCGCATCGTGTGCTACAGCACGGTCACCCGGCAACGGGCCCTCGAGGCGACC 

2401  - + - + - + - + - + - +  2460 

EGDRIVCYSTVTRQRALEAT 

CCCGCT 

2461  -  2466 

PA- 
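
The listing above pairs each 60-nucleotide row with its 20-residue translation. As a minimal sketch of how such a listing can be checked, the code below translates a DNA string with the standard genetic code and splits it into numbered rows. It assumes the reading frame starts at nucleotide 1 and that the standard code applies, as the listing suggests; the function names `translate` and `numbered_rows` are hypothetical and chosen only for illustration.

```python
from typing import Iterator, Tuple

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
          "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; a trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    # Unknown codons (e.g. containing N) are rendered as 'X'.
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, usable, 3))


def numbered_rows(dna: str, width: int = 60) -> Iterator[Tuple[int, int, str, str]]:
    """Yield (first, last, nucleotides, amino acids) for each row.

    Positions are 1-based and inclusive; width should be a multiple of 3
    so that every row stays in the same reading frame.
    """
    for start in range(0, len(dna), width):
        chunk = dna[start:start + width]
        yield start + 1, start + len(chunk), chunk, translate(chunk)


if __name__ == "__main__":
    first_row = "TTCCAGTCCTCTGGCAGCCCTATCAAGCCCCGCATCAACCTTGACCGCCCCTCCACCTCC"
    for first, last, nt, aa in numbered_rows(first_row):
        print(nt)
        print(f"{first} - {last}")
        print(aa)
```

Running each row of the listing through `translate` and comparing the result with the printed amino acid line is a quick way to catch transcription errors in either line.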


